1 AT&T

Note: Same as North DAG collection.
Origin Notes: Originally collected by Stephen North at AT&T Bell Labs. The original link from 1995 is broken: ftp://ftp.research.att.com/dist/drawdag. Di Battista et al. modified the dataset by removing isomorphic graphs, connecting disconnected graphs, and removing cycles.
graph features handled: Acyclic, Directed edges
Graph features in papers: generic, large, directed edges, layered graphs, n-layers, DAG, hierarchical, bipartite
Origin Paper: Drawing Directed Acyclic Graphs: An Experimental Study (https://www.notion.so/Drawing-Directed-Acyclic-Graphs-An-Experimental-Study-7f730ceea4a744e08bf091e8c23e8a95?pvs=21)
Originally found at: http://www.graphdrawing.org/data.html
Size: 1277 graphs, 10 to 100 nodes, 9 to 241 edges
Number of Graphs: 1277
format: GraphML
Child collections: North DAGs (North%20DAGs%20a58f7143ef524c8a8c737df90162d3fb.md)
Appeared in years: 1996, 2004, 2005, 2011, 2016, 2017, 2018, 2019
Type of Collection: Uniform Benchmark
is it stored properly?: No
must be analyzed: Yes
In repo?: Yes
Related to Literature - Algorithm (1) (Dataset tag relations): Drawing Large Graphs with a Potential-Field-Based Multilevel Algorithm (https://www.notion.so/Drawing-Large-Graphs-with-a-Potential-Field-Based-Multilevel-Algorithm-27a6b266bb2c4c92976ed04d2afe9bed?pvs=21), Simple and Efficient Bilayer Cross Counting (https://www.notion.so/Simple-and-Efficient-Bilayer-Cross-Counting-6f978d8b0ceb4a6eb7df52ed82999861?pvs=21), Drawing Large Graphs with a Potential-Field-Based Multilevel Algorithm (https://www.notion.so/Drawing-Large-Graphs-with-a-Potential-Field-Based-Multilevel-Algorithm-a4fe70c68ad64ed3849c47d95afc8798?pvs=21), An Experimental Comparison of Fast Algorithms for Drawing General Large Graphs (https://www.notion.so/An-Experimental-Comparison-of-Fast-Algorithms-for-Drawing-General-Large-Graphs-bbb7bb7d51d84d109030dee3c06d895d?pvs=21), A Natural Quadratic Approach to the Generalized Graph Layering Problem (https://www.notion.so/A-Natural-Quadratic-Approach-to-the-Generalized-Graph-Layering-Problem-bc71a4b3349649668c28d843a3e69037?pvs=21), Advances in the Planarization Method: Effective Multiple Edge Insertions (https://www.notion.so/Advances-in-the-Planarization-Method-Effective-Multiple-Edge-Insertions-c518ce875daa4fe7b003ad506eb9a347?pvs=21), Large-Graph Layout with the Fast Multipole Multilevel Method (https://www.notion.so/Large-Graph-Layout-with-the-Fast-Multipole-Multilevel-Method-def1fdc467f441abb94868eccd8a5a34?pvs=21), A Flow Formulation for Horizontal Coordinate Assignment with Prescribed Width (https://www.notion.so/A-Flow-Formulation-for-Horizontal-Coordinate-Assignment-with-Prescribed-Width-e1369233a80f4ae8a1ffbedd48c05be2?pvs=21), Compact Layered Drawings of General Directed Graphs (https://www.notion.so/Compact-Layered-Drawings-of-General-Directed-Graphs-f985ab75f9cb40d382f62f61eeff25c7?pvs=21)
cleaned format?: Yes
duplicate?: No
link works?: Yes
Added in paper: Yes
OSF link json: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d90e87803e0c0b04558c1f
Origin paper plaintext: Drawing Directed Acyclic Graphs: An Experimental Study
Page id: e86f130c42344169a9d75a61abc7e487
unavailable/skip: No
Cleaned ALL data: No
OSF link gexf: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d949574cf748107605564e
OSF link gml: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d96e3b1101aa0ea66a0c45
OSF link graphml: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d970d31101aa0ea36a0cb0
first look: No
sparkline data: {'min': 10, 'max': 100, 'step_size': 5, 'num_bins': 21, 'bins': [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100], 'num_nodes': [0, 0, 282, 169, 191, 103, 81, 81, 81, 58, 24, 49, 41, 17, 12, 24, 16, 18, 14, 13, 3]}
Related to Literature - Algorithm (Dataset tag relations) 1: Drawing Large Graphs with a Potential-Field-Based Multilevel Algorithm (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Drawing%20Large%20Graphs%20with%20a%20Potential-Field-Based%20%203c0831d7e44545b0894bb5b8a4aa8f54.md), An Experimental Comparison of Fast Algorithms for Drawing General Large Graphs (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/An%20Experimental%20Comparison%20of%20Fast%20Algorithms%20for%20%20190e5036cf974a879b50614cfff525f1.md), Large-Graph Layout with the Fast Multipole Multilevel Method (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Large-Graph%20Layout%20with%20the%20Fast%20Multipole%20Multile%20b88c56b7799741ccbbb9d4f05ea8df4b.md), Compact Layered Drawings of General Directed Graphs (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Compact%20Layered%20Drawings%20of%20General%20Directed%20Graph%209591b47221954ff68fc0758e8e0a8dd8.md), A Flow Formulation for Horizontal Coordinate Assignment with Prescribed Width (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/A%20Flow%20Formulation%20for%20Horizontal%20Coordinate%20Assig%205274be7cd9aa4d90815f8958773c6fa7.md), A Natural Quadratic Approach to the Generalized Graph Layering Problem (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/A%20Natural%20Quadratic%20Approach%20to%20the%20Generalized%20Gr%20dcd688b1f87c44f28511297c3091f86e.md), Advances in the Planarization Method: Effective Multiple Edge Insertions (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Advances%20in%20the%20Planarization%20Method%20Effective%20Mul%20884c2bc419eb4197be261c1f1b3898ce.md), Simple and Efficient Bilayer Cross Counting (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Simple%20and%20Efficient%20Bilayer%20Cross%20Counting%20f879285ff264423cb974db4969614248.md), Drawing Directed Acyclic Graphs: An Experimental Study (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Drawing%20Directed%20Acyclic%20Graphs%20An%20Experimental%20St%201677531652194663b7fdf25025c61cc6.md), Measuring Symmetry in Drawings of Graphs (../Benchmark%20sets%200cc6b5e454304aec98f3b59b1a720476/Literature%20ad87f14e7097454fb2f784e2c8a2797a/Literature%20-%20Algorithm%2012e01bfc60a84007aa7d2d34293e123d/Measuring%20Symmetry%20in%20Drawings%20of%20Graphs%203aac09f89def4584a9cddd63aa0d7efc.md)
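
The sparkline data in the properties above can be sanity-checked against the stated collection size. The sketch below copies the histogram into a Python dictionary and sums the per-bin counts; the function name `total_graphs` is hypothetical and only serves this illustration.

```python
# Histogram of graphs per node-count bin, copied from the "sparkline data" property.
sparkline = {
    "min": 10, "max": 100, "step_size": 5, "num_bins": 21,
    "bins": list(range(0, 101, 5)),
    "num_nodes": [0, 0, 282, 169, 191, 103, 81, 81, 81, 58, 24,
                  49, 41, 17, 12, 24, 16, 18, 14, 13, 3],
}


def total_graphs(data):
    """Sum the per-bin counts after checking that the bin layout is consistent."""
    assert len(data["bins"]) == data["num_bins"] == len(data["num_nodes"])
    return sum(data["num_nodes"])


total = total_graphs(sparkline)
print(total)  # 1277, matching "Number of Graphs"
```

The two empty leading bins agree with the stated minimum of 10 nodes.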

2 Body

Statistics

four_in_one.svg

Descriptions from Literature

From “Drawing Directed Acyclic Graphs: An Experimental Study”:

The experimental study was performed on two different sets of DAGs, both with a strong connection to “real-life” applications. We considered two typical contexts where DAGs play a fundamental role, namely software engineering and project planning.

The first set of test DAGs are what we call the North DAGs. They are obtained from a collection of directed graphs [28], that North collected at AT&T Bell Labs by running for two years Draw DAG, an e-mail graph drawing service that accepts directed graphs formatted as e-mail messages and returns messages with the corresponding drawings [27].

Originally, the North DAGs consisted of 5114 directed graphs, whose number of vertices varied in the range 1 … 7602. However, the density of the directed graphs with a number of vertices that did not fall in the range 10 … 100 was very low (see also the statistics in [28]); since such directed graphs represent a very sparse statistical population we decided to discard them. Then we noted that many directed graphs were isomorphic; since the vertices of the directed graphs have labels associated with them, the problem is tractable. For each isomorphism class, we kept only one representative directed graph. Also, we deleted the directed graphs where subgraphs were specified as clusters, to be drawn in their own distinct rectangular region of the layout, because constrained algorithms are beyond the scope of this study. This filtering left us with 1277 directed graphs.
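
The remark that vertex labels make the isomorphism check tractable can be illustrated with a minimal sketch. It assumes that labels are unique within each graph and that two graphs count as isomorphic only when the isomorphism preserves labels; the excerpt does not spell out the exact procedure, and the names `canonical_key` and `keep_one_per_class` are hypothetical.

```python
def canonical_key(labels, edges):
    """Canonical form of a directed graph whose vertices carry unique labels.

    labels: iterable of vertex labels
    edges:  iterable of (source_label, target_label) pairs
    Under the uniqueness assumption, two graphs are label-preserving
    isomorphic exactly when their label sets and edge sets coincide.
    """
    return (frozenset(labels), frozenset(edges))


def keep_one_per_class(graphs):
    """Keep the first graph seen for each canonical key, preserving input order."""
    seen = set()
    representatives = []
    for labels, edges in graphs:
        key = canonical_key(labels, edges)
        if key not in seen:
            seen.add(key)
            representatives.append((labels, edges))
    return representatives


graphs = [
    (["a", "b", "c"], [("a", "b"), ("b", "c")]),
    (["c", "b", "a"], [("b", "c"), ("a", "b")]),  # same graph, listed in another order
    (["a", "b", "c"], [("a", "c"), ("b", "c")]),  # different edge set
]
unique = keep_one_per_class(graphs)
print(len(unique))
```

Without labels, the same filtering would require a general graph isomorphism test, which is considerably more expensive.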

Still, 491 directed graphs were not connected and this was a problem for running algorithms implemented in GDW (they assume input directed graphs to be connected). Instead of discarding the directed graphs, we followed a more practical approach, by randomly adding a minimum set of directed edges that makes each directed graph connected. Finally, we made the directed graph acyclic, where necessary, by applying some heuristics for inverting the direction of a small subset of edges.
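
The two preprocessing steps described above can be sketched as follows. The excerpt names neither the edge-selection rule nor the cycle-removal heuristic, so this is an illustration under assumptions: components are joined in a chain with k - 1 random edges (the minimum for k weak components), and cycles are broken by reversing the back edges of a depth-first search, a standard technique swapped in here. All function names are hypothetical, and the input is assumed to have no self-loops.

```python
import random


def weak_components(nodes, edges):
    """Weakly connected components via union-find (edge direction ignored)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    comps = {}
    for v in nodes:
        comps.setdefault(find(v), []).append(v)
    return list(comps.values())


def connect(nodes, edges, rng):
    """Add k - 1 random directed edges that chain the k weak components."""
    comps = weak_components(nodes, edges)
    extra = [(rng.choice(a), rng.choice(b)) for a, b in zip(comps, comps[1:])]
    return list(edges) + extra


def make_acyclic(nodes, edges):
    """Reverse every DFS back edge; assumes the input has no self-loops."""
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in nodes}
    done = object()  # sentinel marking an exhausted adjacency iterator
    result = []
    for root in nodes:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(adj[root]))]
        while stack:
            u, it = stack[-1]
            v = next(it, done)
            if v is done:
                color[u] = BLACK
                stack.pop()
            elif color[v] == GRAY:
                result.append((v, u))  # back edge closes a cycle: reverse it
            else:
                result.append((u, v))
                if color[v] == WHITE:
                    color[v] = GRAY
                    stack.append((v, iter(adj[v])))
    return result


def is_acyclic(nodes, edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed."""
    indeg = {v: 0 for v in nodes}
    out = {v: [] for v in nodes}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return removed == len(nodes)


nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]  # one cycle, two components
connected = connect(nodes, edges, random.Random(0))
dag = make_acyclic(nodes, connected)
```

Reversing back edges keeps the edge count unchanged, which matters here because the benchmark later groups graphs by number of edges.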

We then ran a first set of experiments and produced the statistics by grouping the DAGs by number of vertices. Although the comparison among the algorithms looked consistent (the produced plots never oddly overlapped), each single plot was not satisfactory, because it showed peaks and valleys. We went back to study the test suite and observed that grouping them by number of vertices was not the best approach. In fact, the North DAGs come from very heterogeneous sources, mainly representing different phases of various software engineering projects; as a result, directed graphs with more or less the same number of vertices may be either very dense or very sparse.

Since most of the analyzed quality measures strongly depend on the number of edges of the DAG (e.g. area, number of bends, and number of crossings), we decided that a better approach was to group the DAGs by number of edges. After some tests, we clustered the DAGs into nine groups, each with at least 40 DAGs, so that the number of edges in the DAGs belonging to group i, 1 ≤ i ≤ 9, is in the range 10i … 10i + 9 (see Fig. 3). The resulting test suite consists of 1158 DAGs, each with edges in the range 10 … 99.
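
The grouping rule above amounts to integer division of the edge count by 10. A minimal sketch, with the hypothetical helper names `edge_group` and `group_by_edges`:

```python
def edge_group(num_edges):
    """Return the group index i (1..9) with 10*i <= num_edges <= 10*i + 9.

    Graphs with fewer than 10 or more than 99 edges fall outside the nine
    groups and yield None.
    """
    i = num_edges // 10
    return i if 1 <= i <= 9 else None


def group_by_edges(edge_counts):
    """Map each group index to the edge counts that fall into it."""
    groups = {}
    for m in edge_counts:
        i = edge_group(m)
        if i is not None:
            groups.setdefault(i, []).append(m)
    return groups


# Made-up edge counts for illustration; 9, 100, and 241 fall outside the groups.
groups = group_by_edges([9, 10, 19, 45, 99, 100, 241])
print(groups)
```

This also explains why the grouped suite (1158 DAGs) is smaller than the filtered collection (1277 graphs, 9 to 241 edges): graphs outside the 10 to 99 edge range belong to no group.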

From “Layer-Free Upward Crossing Minimization”:

North DAGs. The North DAGs have been introduced in an experimental comparison of algorithms for drawing DAGs [Di Battista et al. 2000]. The benchmark set contains 1,158 DAGs collected by Stephen North, which were slightly modified by Di Battista et al. The graphs are grouped into nine sets, where set i contains graphs with 10i to 10i + 9 arcs for i = 1, …, 9.

Example Figures

From “Drawing Directed Acyclic Graphs: An Experimental Study”:

Untitled

Untitled

From “A Natural Quadratic Approach to the Generalized Graph Layering Problem”:

Screen Shot 2023-08-09 at 4.26.57 PM.png

From “A Flow Formulation for Horizontal Coordinate Assignment with Prescribed Width”:

Screen Shot 2023-08-09 at 4.24.19 PM.png